Attention-Gated Brain Propagation: How the brain can implement reward-based error backpropagation
Much recent work has focused on biologically plausible variants of supervised learning algorithms. However, there is no teacher in the motor cortex that instructs the motor neurons, and learning in the brain depends on reward and punishment. We demonstrate a biologically plausible reinforcement learning scheme for deep networks with an arbitrary number of layers. The network chooses an action by selecting a unit in the output layer and uses feedback connections to assign credit to the units in successively lower layers that are responsible for this action. After the choice, the network receives reinforcement; there is no teacher correcting the errors.
Review for NeurIPS paper: Attention-Gated Brain Propagation: How the brain can implement reward-based error backpropagation
Additional Feedback:
- To me, the fact that learning was not much slower than standard supervised learning seems like the most important result of the paper, and I would have liked to see more analysis of how this works (rather than just a report of the empirical result). It would also be nice to see a more systematic exploration of how this scales with the number of classes, including larger numbers of classes.
- This is an important and strong statement about physiology, but I'm not sure the references support it. Many references are given, but it isn't the main topic of any of them. I looked fairly carefully for support for this statement in the first reference and did not find it.
Meta-review for NeurIPS paper: Attention-Gated Brain Propagation: How the brain can implement reward-based error backpropagation
The reviewers agreed that this paper provides an important contribution to the biological learning literature, and agreed that it should be accepted. However, the reviewers were also in agreement that the authors must do the following for the camera-ready version of the paper: 1) Provide greater clarity that this is an extension of AGREL and does not involve any changes to the core AGREL algorithm, but rather, a means of gating the attention signals sent back through multiple layers.